Abstract

In this paper we discuss an empirical phenomenon known as the 20-60-20 rule. It says that if we
split the population into three groups, according to some arbitrary benchmark criterion, then this
particular ratio implies some sort of balance. From a practical point of view, this feature often leads to
efficient management or control. We provide a mathematical illustration, justifying the occurrence of
this rule in many real-world situations. We show that for any population which can be described
using a multivariate normal vector, this fixed ratio leads to a global equilibrium state, when dispersion
and linear dependence measurement is considered.
Introduction

The 20-60-20 rule is an empirical statement. It says that if we want to split the population into three
groups, using some arbitrary benchmark criterion, then the ratio of 20%, 60% and 20% proves to give
an efficient partition. The division is usually made according to the performance of each element in the
population and the groups are referred to as positive, neutral and negative, respectively. The first group
relates to elements of the population which contribute positively to the considered subject (e.g. effective
workers, top sales managers, productive members), while the last one denotes the opposite. The middle
set corresponds to the middle part of the population, having average performance. Put another
way, we cluster the population based on a notion of effectiveness.

The importance of this rule comes from the fact that this particular partition seems to be the most
effective one for many empirical problems. Let us present in detail two common illustrations of this
phenomenon and then comment on the efficiency to make this idea more transparent.

The first example considers sales departments. In almost any big company, the employees of the 
sales department could be split into three groups, maintaining 20-60-20 ratio. The first group are top 
performers, who make big profits, even without supervision. The middle group are people who need to 
be managed to make average but stable profits. The last group are people who are heading towards 
termination or resignation. They produce no good income, even when supervised. 

The second example relates to change capability. If you are willing to make substantial changes in 
any big institution, then on average 20% of the people are ready, willing and able to change, while 20% 
of people would not accept the change, whatever the cost. The middle 60% will wait to see how the 
situation turns out. 

Corporations use the 20-60-20 rule widely in management and sales departments [15, 13]. One of the
practical aspects of this phenomenon is that different procedures and methods are created
to handle the efficiency in the positive, negative and neutral groups, and the 20-60-20 ratio proves to be the
most efficient partition. For example, in many problems related to human resource management, one
should identify and focus attention on the middle 60%, as this group could and should be managed
efficiently.

* Institute of Mathematics, University of Warsaw, Warsaw, Poland. 

† Institute of Mathematics, Jagiellonian University, Cracow, Poland.





Of course there are countless illustrations of this phenomenon. One could consider the overall condition
of a financial market, fraud and theft capability among a group of people, the structure of an electorate, sport
performance among athletes, potential of students, patient handling, medical treatments, etc. Please see
e.g. [14, 3, 1, 5, 7, 8, 11, 2, 4], where the 20-60-20 ratio is used and detailed procedures are proposed
to handle many practical problems.

The natural question is why this specific 20/60/20 ratio is valid in so many situations? Why not 
10/80/10 or 30/40/30? Is this a coincidence, or does it follow from some underlying and fundamental 
structure of the population? 

While the 20-60-20 principle is very popular among practitioners, no scientific evidence for it has been
presented yet, to the best of the authors' knowledge. Consequently, this noteworthy rule has become more of a
slogan than a scientific fact.

A possible mathematical illustration of this phenomenon, based on dispersion and linear dependence
measurement, will be the main topic of this paper. We will show that if a (multivariate) random
vector is distributed normally and we condition on the (quantile function of the) first coordinate,
then a ratio close to 20/60/20 implies a global equilibrium state, when dispersion and linear dependence
measurement is considered. In particular, we prove that this particular partition implies the equality
of covariance matrices for all conditional vectors, implying some sort of global balance in the population.
We will also discuss the case of monotone dependence using conditional Kendall $\tau$ and Spearman
$\rho$ matrices.

The material is organized as follows. The introduction is followed by short preliminaries, where we
establish the basic notation used throughout this paper. Next, in Section 2 we introduce a mathematical
model for the 20-60-20 rule and define the equilibrium state, using conditional covariance matrices. The
20-60-20 rule for multivariate normal vectors is discussed in Section 3. Theorem 1 might be considered
the main result of this paper. Section 4 is devoted to the study of different equilibrium states, obtained
using correlation matrices, Kendall $\tau$ matrices and Spearman $\rho$ matrices. In particular, we present there
some theoretical results for Spearman $\rho$ matrices and a numerical example illustrating
the 20-60-20 rule for sample data. In Section 5 we briefly discuss what happens if we drop the assumption
of normality. The general elliptic case is considered there.


1 Preliminaries 

Let $(\Omega, \Sigma, \mathbb{P})$ be a probability space and let $n \in \mathbb{N}$. Let us fix an $n$-dimensional continuous random vector
$X = (X_1, \ldots, X_n)$. We will use

$$H(x_1, \ldots, x_n) := \mathbb{P}[X_1 \le x_1, \ldots, X_n \le x_n]$$

to denote the corresponding joint distribution function and

$$H_i(x) = \mathbb{P}[X_i \le x], \quad i = 1, 2, \ldots, n,$$

to denote the marginal distribution functions. Given a Borel set $B$ in $\mathbb{R}^n$ such that

$$\mathbb{P}[\{\omega \in \Omega : (X_1(\omega), \ldots, X_n(\omega)) \in B\}] > 0$$

we can define the conditional distribution $H_B$ for all $(x_1, \ldots, x_n) \in B$ by

$$H_B(x_1, \ldots, x_n) = \mathbb{P}[X_1 \le x_1, \ldots, X_n \le x_n \mid X \in B]. \tag{1}$$


In other words, we truncate the random vector $X$ to the Borel set $B$. If necessary, we assume
the existence of regular conditional probabilities. In this paper we will assume that $B$ is a non-degenerate
rectangle, i.e. $B \in \mathcal{R}$, where

$$\mathcal{R} := \{A \subseteq \mathbb{R}^n : A = [a_1, b_1] \times [a_2, b_2] \times \ldots \times [a_n, b_n], \text{ where } a_i, b_i \in \overline{\mathbb{R}} \text{ and } a_i < b_i\}.$$

As we will be mainly interested in quantile-based conditioning on the first coordinate, for $q_1, q_2 \in [0,1]$
such that $q_1 < q_2$, we shall use the notation

$$H_{[q_1,q_2]}(x_1, \ldots, x_n) := H_{B(q_1,q_2)}(x_1, \ldots, x_n), \tag{2}$$

where the conditioning set is given by

$$B(q_1, q_2) := [H_1^{-1}(q_1), H_1^{-1}(q_2)] \times \mathbb{R} \times \ldots \times \mathbb{R}.$$

We shall also refer to $H_{[q_1,q_2]}$ as the truncated distribution, while $B(q_1, q_2)$ will be called the truncation interval
(see [10]).

Moreover, we will denote by $\mu = (\mu_1, \ldots, \mu_n)$ and $\Sigma = \{\sigma_{ij}\}_{i,j=1,\ldots,n}$ the mean vector and covariance
matrix of $X$. Similarly as in formula (1), given $B$, we will use $\mu_B$ and $\Sigma_B$ to denote the conditional mean
vector and the conditional covariance matrix, i.e. the mean vector and covariance matrix of a
random vector with distribution $H_B$. Consequently, as in (2), we shall write

$$\mu_{[q_1,q_2]} := \mu_{B(q_1,q_2)} \quad \text{and} \quad \Sigma_{[q_1,q_2]} := \Sigma_{B(q_1,q_2)}.$$

We will also use $\Phi$ and $\phi$ to denote the distribution and density function of a standard univariate normal
distribution, respectively.

2 The global balance 

To split the whole population into three separate groups based on a notion of effectiveness, we need to
make an assumption about the probability distribution of the whole population and the given benchmark,
which measures the effectiveness of each element in the population. We will assume that $X \sim \mathcal{N}(\mu, \Sigma)$,
i.e. the population can be described using an $n$-dimensional random vector $X = (X_1, \ldots, X_n)$, which is
normally distributed with mean vector $\mu$ and covariance matrix $\Sigma$. Furthermore, we will assume that
the benchmark level is determined by the first coordinate, i.e. $X_1$. Please note that for a multivariate
normal vector this may be a linear combination of all other coordinates. One could look at the other coordinates as
various factors which could influence the main benchmark. Note that if we talk about measures of people's
abilities, then Gaussian functions, often described as bell curves, are a natural choice.

We will seek two real numbers $q_1, q_2 \in [0,1]$ and the corresponding partition

$$B(0, q_1), \quad B(q_1, 1 - q_2), \quad B(1 - q_2, 1),$$

which will admit some sort of equilibrium. In other words, we want to divide the whole population into
three subgroups, corresponding to the lower $100q_1\%$, the middle $100(1 - q_1 - q_2)\%$ and the upper $100q_2\%$
of the population, where the effectiveness is measured by the benchmark. To do so, let us give a definition
of equilibrium state or global balance.

Definition 1. We will say that a global balance (or equilibrium state) is achieved in $X$ if

$$\Sigma_{[0,q_1]} = \Sigma_{[q_1,1-q_2]} = \Sigma_{[1-q_2,1]} \tag{3}$$

for some $q_1, q_2 \in [0,1]$ such that $q_1 < 1 - q_2$.

Definition 1 seems to be very intuitive. Indeed, the equality of conditional covariance matrices says
that:

1. The dispersion measured by variance is the same in each subgroup for any coordinate $X_i$ (for
$i = 1, 2, \ldots, n$). In particular, the dispersion of the benchmark is the same everywhere.

2. The linear dependence structure, measured by the conditional correlation matrices, is the same in
all three subgroups.

The first property creates a natural equilibrium state, as any perturbation leads to irregularity when
the squared distance from the average member of each group is considered. The choice of this measure of
dispersion seems to be natural, because people's awareness of any differences should be high, as variance
(or standard deviation) seems to be the simplest measure of variability.

The second property relates to the linear dependence structure. The equality of correlation matrices
implies a natural equilibrium between groups, as people tend to notice the simplest (linear) dependencies
first. Any shift between groups will cause dependence instability between them.

In general (i.e. when we drop the assumption of normality) the global balance might not exist, or might
strongly depend on the initial $\Sigma$, when we consider some family parametrised by covariance matrices.
3 The 20/60/20 principle

If $X$ is a multivariate normal vector, it is reasonable to set $q_1 = q_2$, due to the symmetry of the Gaussian
density. For simplicity we will use $q = q_1 = q_2$ for the symmetric case. Thus, we will in fact seek
$q \in (0, 0.5)$ such that the conditional covariance matrix for the lower $100q\%$ of the population coincides
with the conditional covariance matrices of the middle $100(1 - 2q)\%$ and the upper $100q\%$.

We are now ready to present the main result of this paper. We will show that if $X \sim \mathcal{N}(\mu, \Sigma)$, then
the equilibrium state is achieved for a unique $q \in (0, 0.5)$. This is the statement of Theorem 1.

Theorem 1. Let $X \sim \mathcal{N}(\mu, \Sigma)$. Then there exists a unique $q \in (0, 0.5)$ such that the global balance in
$X$ is achieved, i.e. the equality (3) is true for $q = q_1 = q_2$. Moreover, the value of $q$ is independent of $\mu$
and $\Sigma$, and the approximate value of $q$ is $0.198089616\ldots$

The proof of Theorem 1 is surprisingly simple. It is a direct consequence of Lemma 1 and Lemma 2, 
which we will now present and prove. Before we do this, let us give a comment on Theorem 1. It says that 
if we split the whole population, into three separate groups, then the ratio close to 20-60-20 (and in fact 
only this ratio), will imply the equality of conditional covariance matrices for all groups, creating a natural 
equilibrium. To prove Theorem 1 we need an analytic formula for conditional covariance structure, given 
any conditioning Borel set B of positive measure. This will be the statement of Lemma 1. 

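Lemma 1. Let $X \sim \mathcal{N}(\mu, \Sigma)$. Then for any Borel subset $B$ of $\mathbb{R}$ with positive measure,

$$\Sigma_B = \Sigma + \left(D^2[X_1 \mid X_1 \in B] - D^2[X_1]\right)\beta\beta^T,$$

where

$$\beta^T = \left(\frac{\mathrm{Cov}[X_1, X_i]}{D^2[X_1]}\right)_{i=1,\ldots,n}.$$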

Proof of Lemma 1. In the Gaussian world we can describe each random variable $X_i$ as a combination
of the random variable $X_1$ and a random variable $Y_i$ independent of $X_1$. Indeed, we put for $i = 1, \ldots, n$

$$Y_i = X_i - \beta_i X_1, \quad \text{where } \beta_i = \frac{\mathrm{Cov}[X_1, X_i]}{D^2[X_1]}. \tag{4}$$

Obviously $\beta_1 = 1$ and $Y_1 = 0$. Since for $i = 2, \ldots, n$ the newly defined variable $Y_i$ is uncorrelated with
$X_1$ and the two are jointly Gaussian, they are independent.

Next, we calculate the conditional covariance matrix. Using (4), we get for $i, j = 1, \ldots, n$

$$\mathrm{Cov}[X_i, X_j \mid X_1 \in B] = \mathrm{Cov}[\beta_i X_1 + Y_i, \beta_j X_1 + Y_j \mid X_1 \in B].$$

Since $Y_i$ and $Y_j$ do not depend on $X_1$, we get

$$\mathrm{Cov}[Y_i, X_1 \mid X_1 \in B] = 0 = \mathrm{Cov}[Y_j, X_1 \mid X_1 \in B],$$

and

$$\mathrm{Cov}[Y_i, Y_j \mid X_1 \in B] = \mathrm{Cov}[Y_i, Y_j] = \mathrm{Cov}[X_i, X_j] - \beta_i\beta_j D^2[X_1].$$

Therefore, we obtain

$$\mathrm{Cov}[X_i, X_j \mid X_1 \in B] = \mathrm{Cov}[X_i, X_j] + \beta_i\beta_j\left(D^2[X_1 \mid X_1 \in B] - D^2[X_1]\right).$$

Since $\beta_i\beta_j$ is the $(i,j)$-th entry of the $n \times n$ matrix $\beta\beta^T$, this finishes the proof of the lemma. □
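The formula of Lemma 1 can be evaluated directly once the conditional variance of the benchmark is known. The following is a minimal Python sketch under that assumption; the function name `conditional_covariance` and the list-of-lists matrix representation are illustrative choices and do not come from the paper.

```python
def conditional_covariance(sigma, cond_var_x1):
    """Covariance matrix of X given X_1 in B, following the formula of Lemma 1.

    sigma       -- unconditional covariance matrix as a list of lists
    cond_var_x1 -- conditional variance D^2[X_1 | X_1 in B] (assumed known)
    """
    n = len(sigma)
    var_x1 = sigma[0][0]
    # beta_i = Cov[X_1, X_i] / D^2[X_1]; in particular beta_1 = 1.
    beta = [sigma[0][i] / var_x1 for i in range(n)]
    shift = cond_var_x1 - var_x1
    return [[sigma[i][j] + shift * beta[i] * beta[j] for j in range(n)]
            for i in range(n)]


# Hypothetical input: a bivariate normal with unit variances and covariance 0.8,
# and a conditioning set on which the variance of X_1 drops from 1 to 0.5.
example = conditional_covariance([[1.0, 0.8], [0.8, 1.0]], 0.5)
```

Only the variance of the first coordinate enters through the conditioning set; the remaining entries change through the fixed vector of regression coefficients, which is the point the proof makes.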

From Lemma 1 we see that we can parametrise $\Sigma_B$ in such a way that it depends only on the
conditional variance of $X_1$. Thus, we only need to show that there exists $q \in (0, 0.5)$ such that the
(conditional) dispersion of $X_1$ in all three groups, determined by the sets $B(0, q)$, $B(q, 1 - q)$ and $B(1 - q, 1)$,
coincides. This will be the statement of Lemma 2.
Lemma 2. Let $X_1 \sim \mathcal{N}(\mu_1, \sigma_1^2)$. Then there exists a unique $q \in (0, 0.5)$ such that

$$D^2[X_1 \mid X_1 \in B(0, q)] = D^2[X_1 \mid X_1 \in B(q, 1 - q)] = D^2[X_1 \mid X_1 \in B(1 - q, 1)].$$

Moreover, $q = \Phi(x)$, where $x < 0$ is the unique negative solution of the following equation

$$-x\Phi(x) = \phi(x)(1 - 2\Phi(x)), \tag{5}$$

where $\phi$ and $\Phi$ denote the density and distribution function of the standard normal distribution, respectively. The
approximate value of $q$ is $0.198089616\ldots$

Proof of Lemma 2. Without any loss of generality we may assume that $X_1$ has the standard normal
distribution $\mathcal{N}(0, 1)$. Indeed, for $X_1^* = \frac{X_1 - \mu_1}{\sigma_1}$ and $q_1, q_2 \in [0, 1]$ such that $q_1 < q_2$, we get

$$D^2[X_1 \mid H_1(X_1) \in [q_1, q_2]] = D^2[\sigma_1 X_1^* + \mu_1 \mid \Phi(X_1^*) \in [q_1, q_2]] = \sigma_1^2\, D^2[X_1^* \mid \Phi(X_1^*) \in [q_1, q_2]].$$

To proceed, we need to compute the first two moments of the truncated normal distribution of $X_1$.
For transparency, we will show full proofs (compare [10, Section 13.10.1]).

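Let us calculate the conditional expectations $E[X_1 \mid X_1 < x]$ and $E[X_1 \mid x < X_1 < -x]$ for any fixed
$x \in (-\infty, 0)$. Since $\phi'(x) = -x\phi(x)$, we get

$$E[X_1 \mid X_1 < x] = \frac{1}{\Phi(x)} \int_{-\infty}^{x} z\phi(z)\,dz = -\frac{\phi(x)}{\Phi(x)},$$

$$E[X_1 \mid x < X_1 < -x] = 0.$$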

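To get the corresponding second moments we integrate by parts:

$$E[X_1^2 \mid X_1 < x] = \frac{1}{\Phi(x)} \int_{-\infty}^{x} z^2\phi(z)\,dz = \frac{1}{\Phi(x)} \left( \big[-z\phi(z)\big]_{-\infty}^{x} + \int_{-\infty}^{x} \phi(z)\,dz \right) = \frac{1}{\Phi(x)}\big(-x\phi(x) + \Phi(x)\big) = 1 - \frac{x\phi(x)}{\Phi(x)},$$

$$E[X_1^2 \mid x < X_1 < -x] = \frac{1}{1 - 2\Phi(x)} \int_{x}^{-x} z^2\phi(z)\,dz = \frac{1}{1 - 2\Phi(x)} \left( \big[-z\phi(z)\big]_{x}^{-x} + \int_{x}^{-x} \phi(z)\,dz \right) = \frac{1}{1 - 2\Phi(x)}\big(2x\phi(x) + 1 - 2\Phi(x)\big) = 1 + \frac{2x\phi(x)}{1 - 2\Phi(x)}.$$

Therefore,

$$D^2[X_1 \mid X_1 < x] = 1 - \frac{x\phi(x)}{\Phi(x)} - \frac{\phi(x)^2}{\Phi(x)^2},$$

$$D^2[X_1 \mid x < X_1 < -x] = 1 + \frac{2x\phi(x)}{1 - 2\Phi(x)}.$$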

Since the conditional expected value behaves like a weighted arithmetic mean, we get that $E[X_1 \mid X_1 < x]$
is strictly increasing in $x$, while $E[X_1^2 \mid x < X_1 < -x]$ and $E[X_1^2 \mid X_1 < x]$ are strictly decreasing with
respect to $x$. Consequently, the central conditional variance $D^2[X_1 \mid x < X_1 < -x]$ is strictly decreasing.
Next, we will show that the tail conditional variance $D^2[X_1 \mid X_1 < x]$ is strictly increasing. Indeed,


f-D 2 [X, | X, < *| = 
dx $(x) 

d>(x) 

f>{x) 

<I>(x) 


X 2 0(X) J(X) 2 [ ^{Xf 


$(x) 


2 , , 
x — 1 + X 


<&(x) <&(x) 

(j){x) 2 ' 


J $(x) 2 


x 2 - - 


$(x) <l)(x) 2 

IMV + 7 ^( x ) 2 

2 $(x); 4 4>(x) 2 


- 1 > 0 . 


The last inequality follows from the fact that, since $\frac{\phi(x)}{\Phi(x)} = -E[X_1 \mid X_1 < x]$ is decreasing and positive,
we get

$$\frac{\phi(x)^2}{\Phi(x)^2} \ge \frac{\phi(0)^2}{\Phi(0)^2} = \frac{2}{\pi} > \frac{4}{7}.$$
Next, note that (compare [9, Lemma 8.1])

$$\lim_{x \to -\infty} D^2[X_1 \mid X_1 < x] = 0 \quad \text{and} \quad D^2[X_1 \mid X_1 < 0] = 1 - \frac{2}{\pi},$$

while

$$\lim_{x \to -\infty} D^2[X_1 \mid x < X_1 < -x] = 1 \quad \text{and} \quad \lim_{x \to 0^-} D^2[X_1 \mid x < X_1 < -x] = 0.$$

Hence there exists a unique $x < 0$ such that

$$D^2[X_1 \mid X_1 < x] = D^2[X_1 \mid x < X_1 < -x].$$
Compare Figure 1 for visualization. 



Figure 1: The graph of the conditional tail variance $D^2[X_1 \mid X_1 \in B(0, q)]$ and the conditional central variance
$D^2[X_1 \mid X_1 \in B(q, 1 - q)]$ as functions of $q \in (0, 0.5)$, under the assumption $X_1 \sim \mathcal{N}(0, 1)$.


Moreover,

$$D^2[X_1 \mid X_1 < x] - D^2[X_1 \mid x < X_1 < -x] = \left(1 - \frac{x\phi(x)}{\Phi(x)} - \frac{\phi(x)^2}{\Phi(x)^2}\right) - \left(1 + \frac{2x\phi(x)}{1 - 2\Phi(x)}\right) = \frac{\phi(x)}{\Phi(x)^2(1 - 2\Phi(x))}\big(-x\Phi(x) - \phi(x)(1 - 2\Phi(x))\big),$$

which shows that $x$ is a (negative) solution of equation (5). Using basic numerical tools we checked
that (5) is satisfied for $x \approx -0.8484646848$, for which $\Phi(x) \approx 0.198089615$. □
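The numerical step at the end of the proof can be reproduced with a short script. The following Python sketch solves equation (5) by bisection using only the standard library; the bracket $[-3, -0.1]$, the iteration count and all function names are illustrative assumptions, not the authors' numerical procedure.

```python
import math


def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def Phi(x):
    """Standard normal distribution function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def tail_variance(x):
    """D^2[X_1 | X_1 < x] for a standard normal X_1."""
    h = phi(x) / Phi(x)
    return 1.0 - x * h - h * h


def central_variance(x):
    """D^2[X_1 | x < X_1 < -x] for a standard normal X_1 and x < 0."""
    return 1.0 + 2.0 * x * phi(x) / (1.0 - 2.0 * Phi(x))


def equation_5(x):
    """Left-hand side minus right-hand side of equation (5)."""
    return -x * Phi(x) - phi(x) * (1.0 - 2.0 * Phi(x))


def solve_equation_5(lo=-3.0, hi=-0.1, iterations=200):
    """Bisection on an assumed bracket where equation_5 changes sign."""
    f_lo = equation_5(lo)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        f_mid = equation_5(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


x_star = solve_equation_5()
q_star = Phi(x_star)
```

At the computed root the tail and central variances returned by `tail_variance` and `central_variance` should agree, which is the equality stated in Lemma 2.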

Theorem 1 provides an illustration of the empirical 20-60-20 rule. In particular, we have shown that for
any multivariate normal vector this fixed ratio leads to a global equilibrium state, when dispersion and
linear dependence measurement is considered. Nevertheless, please note that the equality of conditional
variances does not imply the equality of conditional distributions, as can be seen in Figure 2.

Also, while the linear dependence structure will be the same, the overall dependence in each subgroup,
measured e.g. by the copula function [12], will be different. Indeed, it seems unwise
to require the dependence structure in the best group to coincide with the dependence structure in the
average group. See Figure 3 for an illustrative example.


Remark 1. The equilibrium level $q$ calculated in Lemma 2 depends neither on $\mu$ nor on $\Sigma$. Therefore, if
we consider correlation matrices instead of covariance matrices in (3), then the optimal value of $q$ from
Theorem 1 will also imply the corresponding equilibrium state for correlation matrices.¹

¹ Please note that we need the additional assumption that $X_1$ is not independent of $(X_2, \ldots, X_n)$, as otherwise any $q \in (0, 0.5)$
will satisfy (3) for correlation matrices instead of covariance matrices.
Figure 2: The conditional density function of the lower 20%, middle 60% and upper 20% of the standard
normal distribution. The conditional variances for all three cases coincide.



Figure 3: The conditional samples (upper row) and their conditional copula functions (lower row) from
the bivariate normal with $\mu = (0, 0)$ and $\Sigma = \{\sigma_{ij}\}$, where $\sigma_{11} = \sigma_{22} = 1$ and $\sigma_{12} = \sigma_{21} = 0.8$. The
conditioning is based on the first coordinate and relates to the lower 20%, middle 60% and upper
20% of the whole population.


Remark 2. The value $\|\Sigma_{[0,q]} - \Sigma_{[q,1-q]}\|$, for $q \approx 0.198$ and some arbitrary matrix norm (e.g. the Frobenius
norm), might be used to test how far $X$ is from a multivariate normal distribution. This test is particularly
important, as it shows the impact of the tails on the central part of the distribution, as usually (for
empirical data) the dependence (correlation) structure in the tails significantly increases, revealing
non-normality.
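The statistic from Remark 2 is straightforward to estimate from data. Below is a minimal Python sketch, assuming the sample is a list of equal-length tuples whose first entry is the benchmark; the helper names, the simulated bivariate normal data, the seed and the comparison level $q = 0.35$ are all illustrative assumptions rather than part of the paper.

```python
import math
import random


def covariance_matrix(rows):
    """Sample covariance matrix of a list of equal-length tuples."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
             for j in range(d)] for i in range(d)]


def frobenius_distance(a, b):
    """Frobenius norm of the difference of two matrices."""
    return math.sqrt(sum((x - y) ** 2
                         for row_a, row_b in zip(a, b)
                         for x, y in zip(row_a, row_b)))


def balance_statistic(sample, q=0.198):
    """Estimate ||Sigma_[0,q] - Sigma_[q,1-q]||_F, conditioning on the first coordinate."""
    ordered = sorted(sample, key=lambda row: row[0])
    cut = int(round(q * len(ordered)))
    lower = ordered[:cut]
    middle = ordered[cut:len(ordered) - cut]
    return frobenius_distance(covariance_matrix(lower), covariance_matrix(middle))


def simulate_bivariate_normal(size, r, rng):
    """Standard bivariate normal sample with correlation r."""
    scale = math.sqrt(1.0 - r * r)
    sample = []
    for _ in range(size):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        sample.append((z1, r * z1 + scale * z2))
    return sample


rng = random.Random(2024)
sample = simulate_bivariate_normal(100_000, 0.6, rng)
stat_balanced = balance_statistic(sample, q=0.198)    # near the equilibrium level
stat_unbalanced = balance_statistic(sample, q=0.35)   # away from the equilibrium level
```

For Gaussian data the statistic at $q \approx 0.198$ should be small relative to its value at other levels; turning it into a formal test would additionally require a reference distribution, which is not derived here.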

Remark 3. We can also consider more than three states, when clustering the population (e.g. having 
5 states we might relate to them as critical, bad, normal, good and outstanding performance, based on 
selected benchmark). The ratios, which imply equilibrium state (similar to the one from Definition 1) for 
5 and 7 different states are close to 

0.027/0.243/0.460/0.243/0.027 and 0.004/0.058/0.246/0.384/0.246/0.058/0.004, 
respectively. Those values could be easily computed using results from Lemma 1 and Lemma 2. 
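One way such ratios could be computed for five states is sketched below in Python, assuming a standard normal benchmark and symmetric cut points $-b < -a < a < b$. By Lemma 1 it suffices to equalise the conditional variances of the benchmark, so the sketch solves for $a$ and $b$ with nested bisection. The brackets, iteration counts and function names are assumptions made for illustration; this is not the authors' code.

```python
import math


def pdf(x):
    """Standard normal density (evaluates to 0.0 at +/- infinity)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def cdf(x):
    """Standard normal distribution function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def x_pdf(x):
    """x * pdf(x), with the limit 0 at +/- infinity."""
    return 0.0 if math.isinf(x) else x * pdf(x)


def truncated_variance(lo, hi):
    """Variance of a standard normal variable truncated to (lo, hi)."""
    mass = cdf(hi) - cdf(lo)
    mean = (pdf(lo) - pdf(hi)) / mass
    second_moment = 1.0 + (x_pdf(lo) - x_pdf(hi)) / mass
    return second_moment - mean * mean


def bisect(f, lo, hi, iterations=100):
    """Plain bisection; assumes f changes sign on [lo, hi]."""
    f_lo = f(lo)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def outer_cut(a):
    """Cut point b whose tail variance matches the central variance on (-a, a)."""
    target = truncated_variance(-a, a)
    return bisect(lambda b: truncated_variance(-math.inf, -b) - target, a, 10.0)


def gap(a):
    """Variance of the intermediate group minus variance of the central group."""
    b = outer_cut(a)
    return truncated_variance(-b, -a) - truncated_variance(-a, a)


a = bisect(gap, 0.3, 0.84)   # assumed bracket for the inner cut point
b = outer_cut(a)
tail, side, centre = cdf(-b), cdf(-a) - cdf(-b), 1.0 - 2.0 * cdf(-a)
ratios = (tail, side, centre, side, tail)
```

The resulting `ratios` should land near the five-state values quoted in Remark 3; the seven-state case would need one more nested unknown and is not covered by this sketch.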


4 Equilibrium for monotonic dependence

In the definition of the equilibrium state (Definition 1) we have in fact measured the distance between
conditional covariance matrices to compare the variability and linear dependence structure between the
groups. As explained in Remark 1, one could use conditional correlation matrices instead of covariance
matrices and focus on the comparison of the linear dependence structure. Of course there are also other
measures of dependence which could be used to reformulate Definition 1.

Among the most popular ones are the so-called measures of concordance, where Kendall $\tau$ and Spearman $\rho$
are the usual representatives for the two-dimensional case (see [12, Section 5] for more details). Instead
of measuring linear dependence, they focus on monotone dependence, being invariant to any
strictly monotone transform of a random variable (note that correlation is only invariant with respect to positive
linear transformations).

Thus, instead of the covariance matrices $\Sigma_{[0,q]}$, $\Sigma_{[q,1-q]}$ and $\Sigma_{[1-q,1]}$ in (3), we can consider the corresponding
matrices of conditional Kendall $\tau$ and conditional Spearman $\rho$, denoted by $\Sigma^{\tau}_{[0,q]}$, $\Sigma^{\tau}_{[q,1-q]}$, $\Sigma^{\tau}_{[1-q,1]}$
and $\Sigma^{\rho}_{[0,q]}$, $\Sigma^{\rho}_{[q,1-q]}$, $\Sigma^{\rho}_{[1-q,1]}$, respectively. For comparison, we will also consider conditional correlation
matrices, for which we shall use the notation $\Sigma^{r}_{[0,q]}$, $\Sigma^{r}_{[q,1-q]}$ and $\Sigma^{r}_{[1-q,1]}$.

Unfortunately, the analog of Theorem 1 is not true if we substitute covariance matrices with the
Spearman $\rho$ or Kendall $\tau$ matrices in Definition 1. Because of that we need a different notion of
the equilibrium state, as stated in Definition 2.

Definition 2. Let us assume that $X$ is symmetric² and let $\kappa \in \{r, \rho, \tau\}$³. We will say that a quasi-global
balance (or quasi-equilibrium state) is achieved in $X$ for $\kappa$ and $q \in [0,1]$ if

$$\left\| \Sigma^{\kappa}_{[0,q]} - \Sigma^{\kappa}_{[q,1-q]} \right\|_F = \inf_{\hat{q} \in (0,0.5)} \left\| \Sigma^{\kappa}_{[0,\hat{q}]} - \Sigma^{\kappa}_{[\hat{q},1-\hat{q}]} \right\|_F, \tag{6}$$

where $\|\cdot\|_F$ is the standard Frobenius matrix norm given by

$$\|A\|_F := \sqrt{\operatorname{tr} AA^T} = \sqrt{\sum_{i=1}^{n} \sum_{j=1}^{n} |a_{ij}|^2},$$

for any $n$-dimensional matrix $A = \{a_{ij}\}_{i,j=1,\ldots,n}$.

Similarly as in Definition 1, we will say that a global balance (or equilibrium state) is achieved in $X$
for $\kappa$ and $q \in [0,1]$ if the value in (6) is equal to 0.

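For transparency, we will write

$$q^{r} = \operatorname*{arg\,min}_{q \in (0,0.5)} \left\| \Sigma^{r}_{[0,q]} - \Sigma^{r}_{[q,1-q]} \right\|_F, \tag{7}$$

$$q^{\tau} = \operatorname*{arg\,min}_{q \in (0,0.5)} \left\| \Sigma^{\tau}_{[0,q]} - \Sigma^{\tau}_{[q,1-q]} \right\|_F, \tag{8}$$

$$q^{\rho} = \operatorname*{arg\,min}_{q \in (0,0.5)} \left\| \Sigma^{\rho}_{[0,q]} - \Sigma^{\rho}_{[q,1-q]} \right\|_F, \tag{9}$$

to denote the ratios which imply the quasi-equilibrium states given in (6).⁴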

As expected, for $X \sim \mathcal{N}(\mu, \Sigma)$, the values $q^{\tau}$ and $q^{\rho}$ also seem to be very close to 0.2 for almost any
value of $\mu$ and $\Sigma$. To illustrate this property, we have picked 1000 random covariance matrices $\{\Sigma_i\}_{i=1}^{1000}$
for $n = 4$⁵ and computed the values of the functions

$$f_i^{r}(q) = \left\| (\Sigma_i)^{r}_{[0,q]} - (\Sigma_i)^{r}_{[q,1-q]} \right\|_F, \tag{10}$$

$$f_i^{\tau}(q) = \left\| (\Sigma_i)^{\tau}_{[0,q]} - (\Sigma_i)^{\tau}_{[q,1-q]} \right\|_F, \tag{11}$$

$$f_i^{\rho}(q) = \left\| (\Sigma_i)^{\rho}_{[0,q]} - (\Sigma_i)^{\rho}_{[q,1-q]} \right\|_F. \tag{12}$$

To do so, for each $i \in \{1, 2, \ldots, 1000\}$ we have taken a Monte Carlo sample of size 1,000,000 from $X \sim \mathcal{N}(0, \Sigma_i)$
and computed the values of (10), (11) and (12) using MC estimates of the corresponding conditional matrices.
The graphs of $f_i^{r}$, $f_i^{\tau}$ and $f_i^{\rho}$ for $i = 1, 2, \ldots, 50$ are presented in Figure 4. In Figure 5, we also present
the smoothed histogram of the points $\{q_i^{r}\}_{i=1}^{1000}$, $\{q_i^{\tau}\}_{i=1}^{1000}$ and $\{q_i^{\rho}\}_{i=1}^{1000}$, for which the minimum is
attained in (10), (11) and (12) for $i = 1, 2, \ldots, 1000$.

² I.e. $X$ is symmetric with respect to $E[X] = (E[X_1], \ldots, E[X_n])$; note that this implies $\Sigma^{\kappa}_{[0,q]} = \Sigma^{\kappa}_{[1-q,1]}$ for any $q \in (0, 0.5)$.

³ These relate to the conditional correlation matrices, Spearman $\rho$ matrices and Kendall $\tau$ matrices, respectively.

⁴ For simplicity, we use arg min and assume that the (quasi-)equilibrium state exists and is unique.

⁵ With the additional assumption that the correlation coefficients are bigger than 0.2 and smaller than 0.8, to avoid computational
problems resulting from independence or comonotonicity, respectively (see also Remark 1). Note also that the sign of the
correlation coefficient is irrelevant, due to the symmetry of $X$, so without loss of generality we can assume that the correlation
matrix is positive. Moreover, the values of $q^{\tau}$ and $q^{\rho}$ are invariant with respect to $\mu$, so we can set $\mu = 0$ without loss of generality.

Figure 4: The graphs of the functions $f_i^{r}$, $f_i^{\tau}$ and $f_i^{\rho}$ for $i = 1, 2, \ldots, 50$, computed using a sample of size 1,000,000
from $\mathcal{N}(0, \Sigma_i)$ and the corresponding estimates of conditional matrices.

Figure 5: Monte Carlo density functions constructed using the points $\{q_i^{r}\}_{i=1}^{1000}$, $\{q_i^{\tau}\}_{i=1}^{1000}$ and $\{q_i^{\rho}\}_{i=1}^{1000}$. For
each $i = 1, 2, \ldots, 1000$ a sample of size 1,000,000 from $\mathcal{N}(0, \Sigma_i)$ was simulated and the corresponding estimates
of conditional matrices were used for computations.

Unfortunately, in general the values $q^{\tau}$ and $q^{\rho}$ defined in (8) and (9) are not constant and are not independent
of $\Sigma$. In particular, if the dependence inside $X$ is very strong, e.g. the vector $(X_1, X_2, \ldots, X_n)$ is almost
comonotone, then the values of $q^{\tau}$ and $q^{\rho}$ might increase substantially.⁶

To illustrate this property, let us present some theoretical results involving conditional Spearman $\rho$
and Kendall $\tau$. For simplicity, till the end of this subsection, we will assume that $n = 2$.

Then, given $X \sim \mathcal{N}(\mu, \Sigma)$, we know that $\sigma_{12} = \sigma_{21} = r\sqrt{\sigma_{11}\sigma_{22}}$, where $r \in [-1, 1]$ is the correlation
between $X_1$ and $X_2$. It is easy to show (see [9]) that both unconditional and conditional values of
Spearman $\rho$ as well as Kendall $\tau$ depend only on the copula of $X$,⁷ which is parametrised by the
correlation coefficient. Thus, without loss of generality, instead of considering all $\mu$ and $\Sigma$, we might
assume that

$$X = (X_1, X_2) \sim \mathcal{N}(\mu, \Sigma), \quad \text{where } \mu = (0, 0) \text{ and } \Sigma = \begin{pmatrix} 1 & r \\ r & 1 \end{pmatrix},$$

for a fixed $r \in [-1, 1]$.

Let $\rho_{[p,q]}(r)$ and $\tau_{[p,q]}(r)$ denote the corresponding conditional Spearman $\rho$ and Kendall $\tau$, given the
truncation interval $B(p, q)$. Note that $\rho_{[p,q]}(r)$ and $\tau_{[p,q]}(r)$ are odd functions of $r$.

⁶ Note that in our numerical example we have assumed that the correlation for any pair is between 0.2 and 0.8, excluding
extremal cases.

⁷ Note that the (conditional) Spearman $\rho$ and Kendall $\tau$ are invariant to any monotone transform of $X_1$ or $X_2$, and so is
the copula function.
Lemma 3. For all $0 \le p < q \le 1$ and $r \in (-1, 1)$,

$$\rho_{[p,q]}(-r) = -\rho_{[p,q]}(r) \quad \text{and} \quad \tau_{[p,q]}(-r) = -\tau_{[p,q]}(r).$$

Proof. Before we begin the proof, let us recall some basic facts from copula theory (cf. [12] and
references therein). We will use $C^{r}$ to denote the Gaussian copula with parameter $r \in (-1, 1)$, which
coincides with the correlation coefficient. Noting that the copula can be seen as a distribution function
(with uniform margins), let us assume that $(U, V)$ is a random vector with distribution $C^{r}$. We will
denote by $C^{r}_{[p,q]}$ the copula of the conditional distribution of $(U, V)$ under the condition $U \in [p, q]$, where
$0 \le p < q \le 1$. Due to Sklar's Theorem we get the following description of $C^{r}_{[p,q]}$:

$$C^{r}_{[p,q]}\left(u, \frac{C^{r}(q, v) - C^{r}(p, v)}{q - p}\right) = \frac{C^{r}((q - p)u + p, v) - C^{r}(p, v)}{q - p}, \quad u, v \in [0, 1]. \tag{13}$$

Next, it is easy to notice that the distribution function of $(U, 1 - V)$ is equal to $C^{-r}$. Hence the
Gaussian copulas commute with flipping, i.e.

$$C^{-r}(u, v) = u - C^{r}(u, 1 - v) \quad \text{for } u, v \in [0, 1].$$

On the other hand, the flipping transforms the conditional distribution $(U, V)|_{U \in [p,q]}$ to $(U, 1 - V)|_{U \in [p,q]}$.
Hence we get

$$C^{-r}_{[p,q]}(u, v) = u - C^{r}_{[p,q]}(u, 1 - v).$$

Thus, based on [12, Theorem 5.1.9], we conclude

$$\rho_{[p,q]}(-r) = -\rho_{[p,q]}(r),$$

$$\tau_{[p,q]}(-r) = -\tau_{[p,q]}(r).$$

□


We recall that the Spearman $\rho$ and Kendall $\tau$ of the conditional copula $C^{r}_{[p,q]}$ are given by the formulas:

$$\rho_{[p,q]}(r) = \rho(C^{r}_{[p,q]}) = -3 + 12 \int_0^1 \int_0^1 C^{r}_{[p,q]}(u, v)\,du\,dv, \tag{14}$$

$$\tau_{[p,q]}(r) = \tau(C^{r}_{[p,q]}) = -1 + 4 \iint_{[0,1]^2} C^{r}_{[p,q]}(u, v)\,dC^{r}_{[p,q]}(u, v). \tag{15}$$

To describe their behaviour for small $r$ we will need their Taylor expansions with respect to $r$.

Proposition 1. For fixed $p, q \in (0, 1)$ ($p < q$) and $r \in (-1, 1)$ such that $r$ is close to 0, we get

$$\rho_{[p,q]}(r) = \frac{3r}{\pi(q - p)^2}\Big(\Phi(\sqrt{2}x_2) - \Phi(\sqrt{2}x_1) - (q - p)\sqrt{\pi}\big(\phi(x_1) + \phi(x_2)\big)\Big) + O(r^3), \tag{16}$$

$$\tau_{[p,q]}(r) = \frac{2}{3}\rho_{[p,q]}(r) + O(r^3), \tag{17}$$

where $x_1 = \Phi^{-1}(p)$ and $x_2 = \Phi^{-1}(q)$.

Proof. We will use notation similar to the one introduced in Lemma 3. The proof will be based on two
facts. First, for $r = 0$ both $C^{r}$ and $C^{r}_{[p,q]}$ are equal to the product copula $\Pi(u, v) := uv$, i.e.

$$C^{0}(u, v) = uv = C^{0}_{[p,q]}(u, v).$$

Second, the derivative of the distribution function of a bivariate Gaussian distribution having standardised
margins with respect to the parameter $r$ is equal to its density, which implies

$$\frac{\partial C^{r}(u, v)}{\partial r} = \frac{1}{2\pi\sqrt{1 - r^2}} \exp\left(-\frac{\Phi^{-1}(u)^2 + \Phi^{-1}(v)^2 - 2r\Phi^{-1}(u)\Phi^{-1}(v)}{2(1 - r^2)}\right).$$
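We calculate the Taylor expansion of $\rho_{[p,q]}(r)$ at $r = 0$. We have

$$\rho_{[p,q]}(0) = \rho(\Pi) = 0,$$

$$\frac{\partial \rho_{[p,q]}}{\partial r}(0) = 12 \int_0^1 \int_0^1 \frac{\partial C^{r}_{[p,q]}(u, v)}{\partial r}\bigg|_{r=0}\,du\,dv.$$

The derivative of $C^{r}_{[p,q]}$ will be calculated in two steps. First we differentiate formula (13). We get

$$\frac{\partial C^{r}_{[p,q]}}{\partial r}\left(u, \frac{C^{r}(q, v) - C^{r}(p, v)}{q - p}\right) + \partial_2 C^{r}_{[p,q]}\left(u, \frac{C^{r}(q, v) - C^{r}(p, v)}{q - p}\right) \frac{1}{q - p}\left(\frac{\partial C^{r}(q, v)}{\partial r} - \frac{\partial C^{r}(p, v)}{\partial r}\right) = \frac{1}{q - p}\left(\frac{\partial C^{r}((q - p)u + p, v)}{\partial r} - \frac{\partial C^{r}(p, v)}{\partial r}\right).$$

Next, setting $r = 0$, we obtain

$$\frac{\partial C^{0}_{[p,q]}(u, v)}{\partial r} = \frac{1}{q - p}\Big(\phi\big(\Phi^{-1}((q - p)u + p)\big) - u\,\phi\big(\Phi^{-1}(q)\big) - (1 - u)\,\phi\big(\Phi^{-1}(p)\big)\Big)\phi\big(\Phi^{-1}(v)\big).$$

Finally, we get

$$\frac{\partial \rho_{[p,q]}(0)}{\partial r} = \frac{12}{q - p} \int_0^1 \int_0^1 \Big(\phi\big(\Phi^{-1}((q - p)u + p)\big) - u\,\phi\big(\Phi^{-1}(q)\big) - (1 - u)\,\phi\big(\Phi^{-1}(p)\big)\Big)\phi\big(\Phi^{-1}(v)\big)\,du\,dv$$

$$= \frac{12}{q - p} \int_0^1 \Big(\phi\big(\Phi^{-1}((q - p)u + p)\big) - u\,\phi\big(\Phi^{-1}(q)\big) - (1 - u)\,\phi\big(\Phi^{-1}(p)\big)\Big)\,du \int_0^1 \phi\big(\Phi^{-1}(v)\big)\,dv$$

$$= \frac{12}{q - p} \cdot \frac{1}{2\sqrt{\pi}} \left(\frac{1}{q - p} \cdot \frac{1}{2\sqrt{\pi}}\Big(\Phi\big(\sqrt{2}\Phi^{-1}(q)\big) - \Phi\big(\sqrt{2}\Phi^{-1}(p)\big)\Big) - \frac{\phi\big(\Phi^{-1}(q)\big) + \phi\big(\Phi^{-1}(p)\big)}{2}\right).$$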

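The proof of the Kendall $\tau$ case follows from the symmetry

$$\iint C_1\,dC_2 = \iint C_2\,dC_1.$$

We have

$$\frac{\partial \tau_{[p,q]}(r)}{\partial r} = \frac{\partial}{\partial r}\left(4 \iint_{[0,1]^2} C^{r}_{[p,q]}(u, v)\,dC^{r}_{[p,q]}(u, v)\right) = 8 \iint_{[0,1]^2} \frac{\partial}{\partial r} C^{r}_{[p,q]}(u, v)\,dC^{r}_{[p,q]}(u, v).$$

Setting $r = 0$ we get

$$\frac{\partial \tau_{[p,q]}(0)}{\partial r} = 8 \iint_{[0,1]^2} \frac{\partial}{\partial r} C^{0}_{[p,q]}(u, v)\,du\,dv = \frac{8}{12} \frac{\partial \rho_{[p,q]}(0)}{\partial r}.$$

□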


For $\kappa$ denoting either $\rho$ or $\tau$, using Proposition 1, we are now ready to compare the values of $\kappa_{[0,q]}(r)$
and $\kappa_{[q,1-q]}(r)$, changing both $q \in (0, 0.5)$ and $r \in (-1, 1)$. Note that for $n = 2$ the equilibrium state
corresponding to (9) is achieved if and only if $\kappa_{[0,q]}(r) - \kappa_{[q,1-q]}(r) = 0$. In [9, Theorems 4.1 and 4.4], it
was shown that for any fixed $r > 0$, the conditional copulas $C^{r}_{[0,q]}$ are increasing in $q$ while $C^{r}_{[q,1-q]}$ are
decreasing in $q$. Hence the differences

$$\Delta_{\rho}(q, r) = \rho_{[0,q]}(r) - \rho_{[q,1-q]}(r) \quad \text{and} \quad \Delta_{\tau}(q, r) = \tau_{[0,q]}(r) - \tau_{[q,1-q]}(r) \tag{18}$$

are strictly increasing in $q$ and change sign. Using Lemma 3 we know that for each $r \in (-1, 1)$
such that $r \ne 0$, there exists exactly one $q \in (0, 0.5)$ for which $\Delta_{\rho}(q, r) = 0$ and one $q \in (0, 0.5)$ for which
$\Delta_{\tau}(q, r) = 0$. Let

$$\Lambda_{\kappa} : (-1, 1) \to (0, 0.5), \quad \kappa = \rho, \tau,$$

be the function which assigns the appropriate $q$ to any $r \ne 0$, and let $\Lambda_{\kappa}(0) = \liminf_{t \to 0} \Lambda_{\kappa}(t)$.⁸ We will now
show that the graphs of $\Lambda_{\rho}$ and $\Lambda_{\tau}$ are orthogonal to the line $r = 0$.

⁸ Note that for $r = 0$ any $q \in (0, 0.5)$ implies the equilibrium state, which is the reason we define $\Lambda_{\kappa}(0)$ in that way.
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Theorem 2. For r close to 0, we get 


A p (r ) = A T (r) + 0(r 2 ) = q* + 0(r 2 ), 

where q* ss 0, 2132413 is a solution of the following equation 

(1-4 q + 6g 2 )$(v / 2$" 1 (g)) - g(l - 6g + 8g 2 )v^($ _1 (g)) - g 2 = 0. 

Proof. If r = 0, then for any $q \in (0,0.5)$ both differences in (18) are equal to 0, so for clarity we may set
$\Lambda_\kappa(0) = q^*$. Using Lemma 3, without loss of generality, we may assume that r > 0. Due to Proposition
1, for small r, we get

$$\rho_{[0,q]}(r) - \rho_{[q,1-q]}(r) = \frac{3r}{\pi q^2}\Big(\Phi(\sqrt{2}\,\Phi^{-1}(q)) - q\sqrt{\pi}\,\phi(\Phi^{-1}(q))\Big) + O(r^3)$$

$$\qquad - \frac{3r}{\pi (1-2q)^2}\Big(1 - 2\Phi(\sqrt{2}\,\Phi^{-1}(q)) - 2(1-2q)\sqrt{\pi}\,\phi(\Phi^{-1}(q))\Big) + O(r^3)$$

$$\qquad = \frac{3r}{\pi q^2 (1-2q)^2}\Big((1 - 4q + 6q^2)\,\Phi(\sqrt{2}\,\Phi^{-1}(q)) - q(1 - 6q + 8q^2)\sqrt{\pi}\,\phi(\Phi^{-1}(q)) - q^2\Big) + O(r^3)$$

and a similar formula for $\tau$. □
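
The value of q* quoted in Theorem 2 can be checked numerically. The sketch below assumes that the equation reads $(1 - 4q + 6q^2)\Phi(\sqrt{2}\Phi^{-1}(q)) - q(1 - 6q + 8q^2)\sqrt{\pi}\phi(\Phi^{-1}(q)) - q^2 = 0$, with $\Phi$ and $\phi$ the standard normal distribution function and density. The function names and the search bracket [0.1, 0.3] are illustrative choices and are not taken from the paper.

```python
import math
from statistics import NormalDist

STD = NormalDist()  # standard normal: cdf = Phi, pdf = phi, inv_cdf = Phi^{-1}


def theorem2_lhs(q):
    """Left-hand side of the equation in Theorem 2 (assumed reading, see text)."""
    z = STD.inv_cdf(q)
    return ((1 - 4 * q + 6 * q * q) * STD.cdf(math.sqrt(2) * z)
            - q * (1 - 6 * q + 8 * q * q) * math.sqrt(math.pi) * STD.pdf(z)
            - q * q)


def bisect(f, lo, hi, tol=1e-12):
    """Plain bisection; requires f(lo) and f(hi) to have opposite signs."""
    f_lo, f_hi = f(lo), f(hi)
    if (f_lo > 0) == (f_hi > 0):
        raise ValueError("no sign change on the bracket")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid   # root lies in the upper half
        else:
            hi = mid                # root lies in the lower half
    return 0.5 * (lo + hi)


# Illustrative bracket: the left-hand side is negative at 0.1 and positive at 0.3.
q_star = bisect(theorem2_lhs, 0.1, 0.3)
print(q_star)
```

Under this reading of the equation, the computed root should lie close to the value 0.2132413 stated in the theorem.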

In particular, Theorem 2 implies that $\Lambda_\rho(0) = \Lambda_\tau(0) = q^*$. Using basic numerical calculations, we
get for $\kappa$ denoting $\rho$ or $\tau$

$$0.213 < \Lambda_\kappa(r) < 0.271,$$

for any $r \in (-1,1)$. Nevertheless, this bound is usually much tighter, which could already be observed in
our previous numerical example (see e.g. Figure 4). With some easy calculations, we get

$$0.213 < \Lambda_\kappa(r) < 0.230,$$

for $r \in (-0.9, 0.9)$. The graph of the function $\Delta_\rho(q,r) = \rho_{[0,q]}(r) - \rho_{[q,1-q]}(r)$ for various fixed values of
$q \in (0,0.5)$ is presented in Figure 6; see also Figure 7 for the corresponding graph of $\Delta_\tau$.

Remark 4. When we consider the equilibrium state for conditional Spearman $\rho$ (or Kendall $\tau$) matrices,
we only need to know the dependence structure of X, given by its copula. Thus, we can set any
marginal distributions of $X_1, \ldots, X_n$ without changing the equilibrium. This allows us to consider a much
more general class of multivariate distributions for which the 20-60-20 rule will hold.



Figure 6: The graph of $\Delta_\rho(q,r) = \rho_{[0,q]}(r) - \rho_{[q,1-q]}(r)$ as a function of r for different values of (fixed) q.

Figure 7: The graph of $\Delta_\tau(q,r) = \tau_{[0,q]}(r) - \tau_{[q,1-q]}(r)$ as a function of r for different values of (fixed) q.

5 Abandoning Gaussian world 

When we drop the assumption that $X \sim N(\mu, \Sigma)$, the existence of equilibrium is no longer guaranteed.
A natural question is whether an equivalent of the 20/60/20 rule holds for any elliptical distribution. In this
section we will briefly discuss this matter.

We say that X has an elliptic distribution if it can be defined in terms of a characteristic function

$$\phi_X(t) = e^{i t' \mu}\, \Psi(t' \Sigma t), \qquad (19)$$

where $\mu$ is a vector (which coincides with the mean vector, if it exists), $\Sigma$ is a scale matrix (which is
proportional to the covariance matrix, if it exists) and $\Psi$ is the so-called characteristic generator of the
elliptical distribution (cf. [6] and references therein for a general survey of elliptic distributions). For
simplicity, we will use the so-called stochastic representation of an elliptic distribution. It is well known
(see [6]) that if X has a density, then it is elliptic if and only if it can be represented as

$$X = \mu + \sqrt{\Sigma}\, R\, U,$$

where $\sqrt{\Sigma}$ is any square matrix such that $\sqrt{\Sigma}\sqrt{\Sigma}' = \Sigma$ (e.g. obtained using Cholesky decomposition), U
is an n-dimensional random vector, uniformly distributed on the unit n-sphere, and R is a nonnegative
random variable, corresponding to the radial density, independent of U. Moreover, we will assume that
the first two moments of R exist, which ensures the existence of the mean vector and covariance matrix of
X. Now we can ask whether, for given U and R, the equilibrium state of X always exists and whether it is
invariant with respect to $\mu$ and $\Sigma$.
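
The stochastic representation translates directly into a sampling recipe. The following is a minimal sketch using only the Python standard library; all function names are hypothetical, the Cholesky factor is one admissible choice of the square-root matrix, and the two radial laws (square root of a chi-square variable for the normal case, and the same quantity rescaled by an independent chi-square variable for Student's t) are standard facts rather than constructions taken from the paper.

```python
import math
import random


def cholesky(S):
    """Return lower-triangular L with L L^T = S (S symmetric positive definite)."""
    n = len(S)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = S[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L


def radial_normal(rng, n):
    """R for the multivariate normal: square root of a chi-square(n) variable."""
    return math.sqrt(rng.gammavariate(n / 2.0, 2.0))


def radial_student(nu):
    """R for the multivariate Student t with nu degrees of freedom."""
    def draw(rng, n):
        chi2_n = rng.gammavariate(n / 2.0, 2.0)
        chi2_nu = rng.gammavariate(nu / 2.0, 2.0)
        return math.sqrt(chi2_n * nu / chi2_nu)
    return draw


def sample_elliptic(mu, S, radial, N, rng):
    """Draw N points X = mu + sqrt(S) R U."""
    n = len(mu)
    L = cholesky(S)
    points = []
    for _ in range(N):
        g = [rng.gauss(0.0, 1.0) for _ in range(n)]
        norm = math.sqrt(sum(v * v for v in g))
        u = [v / norm for v in g]      # U: uniform on the unit sphere
        r = radial(rng, n)             # R: radius, drawn independently of U
        points.append([mu[i] + r * sum(L[i][k] * u[k] for k in range(i + 1))
                       for i in range(n)])
    return points


rng = random.Random(0)                 # fixed seed, for reproducibility only
mu, S = [1.0, -1.0], [[1.0, 0.5], [0.5, 2.0]]
xs = sample_elliptic(mu, S, radial_normal, 20000, rng)      # Gaussian sample
ts = sample_elliptic(mu, S, radial_student(5), 20000, rng)  # Student t sample
```

With the normal radial law the sample covariance of `xs` should be close to S itself, while for the Student t sample the scale matrix is only proportional to the covariance matrix, in line with the remark above.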

Unfortunately, it is easy to show that the equilibrium state (with covariance matrices) is not always
achieved and that the quasi-equilibrium state might strongly depend on $\Sigma$, even when we consider only the
class of multivariate Student's t distributions (i.e. we can consider appropriate radial distributions and
covariance matrices in Algorithm 1).

On the other hand, if we substitute covariance matrices with correlation matrices in (3), then we are
able to prove results similar to Theorem 1 for a much more general class of elliptic distributions.

To illustrate this property, we have conducted a simple computational experiment using the multivariate
Student's t distribution, as it is commonly used by practitioners. Assuming n = 4, for any $\nu \in \{2, 3, \ldots, 20\}$
we have picked 100 random matrices $\Sigma_i^\nu$ and for each $i = 1, 2, \ldots, 100$ we have simulated a Monte Carlo
sample of size 1,000,000, assuming $X \sim t_\nu(0, \Sigma_i^\nu)$. Next, we have calculated the values of $q_i^\nu \in (0,0.5)$ for which the
(quasi-)equilibrium state is attained (i.e. for estimates of conditional correlation matrices; see Algorithm 1). In
Figure 8 we present the graph of the 0.1, 0.5 and 0.9 quantiles of the sample $\{q_i^\nu\}_{i=1}^{100}$ for $\nu = 2, 3, \ldots, 20$.
The value of q for which the (quasi-)equilibrium state is achieved clearly depends on the degrees of freedom,
increasing to the value 0.198, which coincides with the equilibrium state for the multivariate normal distribution
(note that the Student's t distribution converges to the normal distribution when $\nu \to \infty$).
Algorithm 1 Compute quasi-equilibrium state for elliptic distribution

Require:
    $n \in \mathbb{N}_+$ - dimension
    $N \in \mathbb{N}_+$ - size of Monte Carlo sample
    radial.Dist - radial distribution (e.g. $\sqrt{\chi^2(n)}$ for multivariate normal)

procedure Equilibrium(n, N, radial.Dist)
    Generate U: N independent samples from n-dimensional unit sphere (uniform density)
    Generate R: N independent samples from (univariate) radial.Dist
    Generate $\Sigma = \{\sigma_{ij}\}$: n x n scale matrix (proportional to covariance matrix)
    while $\{\min_{i \neq j} \sigma_{ij}/\sqrt{\sigma_{ii}\sigma_{jj}} < 0.2\} \lor \{\max_{i \neq j} \sigma_{ij}/\sqrt{\sigma_{ii}\sigma_{jj}} > 0.8\}$ do
        Generate (new) $\Sigma = \{\sigma_{ij}\}$: n x n scale matrix
    end while
    Compute $\sqrt{\Sigma}$, e.g. using Cholesky decomposition
    Compute $X = \{X_{ik}\} = (\sqrt{\Sigma}) R U$ (i.e. matrix n x N; random sample from elliptic distribution)
    Define function Dist(q), for $q \in (0,0.5)$
    function Dist(q)
        Compute $q^1$, sample lower q-quantile of the first coordinate $(X_{1k})_{k=1}^{N}$
        Compute $q^2$, sample lower (1 - q)-quantile of the first coordinate $(X_{1k})_{k=1}^{N}$
        Compute conditional tail sample $X^1$, by selecting all $1 \le k \le N$ for which $X_{1k} \le q^1$
        Compute conditional central sample $X^2$, by selecting all $1 \le k \le N$ for which $q^1 < X_{1k} \le q^2$
        Compute $\Sigma_{[0,q]}$, a (conditional) covariance matrix of $X^1$
        Compute $\Sigma_{[q,1-q]}$, a (conditional) covariance matrix of $X^2$
        Compute $d = \|\Sigma_{[0,q]} - \Sigma_{[q,1-q]}\|_F$
        return d
    end function
    Compute $q = \arg\min_{0 < q < 0.5}$ Dist(q)
    return q
end procedure
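
The procedure above can be sketched in a few lines of standard-library Python. This is a simplified illustration under several assumptions that are not part of the paper: the dimension is fixed to n = 2, the random scale matrix is replaced by a fixed correlation matrix with off-diagonal entry r, conditioning is done on the first coordinate, and the arg min is approximated on a grid with step 0.01 over [0.05, 0.45]. All names are hypothetical.

```python
import math
import random


def sample_bivariate(r, radial, N, rng):
    """N draws of X = sqrt(Sigma) R U for Sigma = [[1, r], [r, 1]] (n = 2)."""
    s = math.sqrt(1.0 - r * r)
    rows = []
    for _ in range(N):
        theta = rng.uniform(0.0, 2.0 * math.pi)
        u1, u2 = math.cos(theta), math.sin(theta)     # uniform on the unit circle
        R = radial(rng)
        rows.append((R * u1, R * (r * u1 + s * u2)))  # Cholesky factor times R*U
    return rows


def covariance(rows):
    """Sample covariance matrix of a list of 2-d points."""
    m = len(rows)
    mean = [sum(p[i] for p in rows) / m for i in range(2)]
    return [[sum((p[i] - mean[i]) * (p[j] - mean[j]) for p in rows) / (m - 1)
             for j in range(2)] for i in range(2)]


def dist(rows_sorted, q):
    """Frobenius distance between tail and central conditional covariances."""
    N = len(rows_sorted)
    k1, k2 = int(q * N), int((1.0 - q) * N)
    tail, centre = rows_sorted[:k1], rows_sorted[k1:k2]  # split at sample quantiles
    A, B = covariance(tail), covariance(centre)
    return math.sqrt(sum((A[i][j] - B[i][j]) ** 2
                         for i in range(2) for j in range(2)))


def equilibrium(r, radial, N, seed=0):
    """Grid approximation of the q minimising dist (quasi-equilibrium state)."""
    rng = random.Random(seed)
    rows = sorted(sample_bivariate(r, radial, N, rng))   # sorted by first coordinate
    grid = [k / 100.0 for k in range(5, 46)]
    return min(grid, key=lambda q: dist(rows, q))


def radial_normal(rng):
    """Radial law of the bivariate normal: square root of chi-square(2)."""
    return math.sqrt(rng.gammavariate(1.0, 2.0))


q_hat = equilibrium(0.7, radial_normal, 20000, seed=1)
print(q_hat)
```

For the Gaussian radial law the returned value should fall near 0.2, up to Monte Carlo and grid error. Passing a different `radial` function (for instance one corresponding to a Student's t distribution) allows a rough exploration of how the quasi-equilibrium moves away from the Gaussian case.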










Figure 8: The graph of the 0.1, 0.5 and 0.9 quantiles of $\{q_i^\nu\}_{i=1}^{100}$ for $\nu = 2, 3, \ldots, 20$.